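In this technical report, we introduce Effidit (Efficient and Intelligent Editing), a digital writing assistant that uses artificial intelligence (AI) technologies to help users write higher-quality text more efficiently. Previous writing assistants typically provide error checking (detecting and correcting spelling and grammatical errors) and limited text-rewriting functionality. With the emergence of large neural language models, some systems also support automatic completion of a sentence or paragraph. In Effidit, we significantly expand the capabilities of a writing assistant by providing functions in five categories: text completion, error checking, text polishing, keywords-to-sentences (K2S), and cloud input methods (cloud IME). In the text completion category, Effidit supports generation-based sentence completion, retrieval-based sentence completion, and phrase completion. In contrast, many other writing assistants so far provide only one or two of these three functions. For text polishing, we offer three functions: (context-aware) phrase polishing, sentence paraphrasing, and sentence expansion, whereas many other writing assistants support only one or two functions in this category. The main contents of this report include the major modules of Effidit, the methods used to implement these modules, and evaluation results for some key methods.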
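Current work on personalized dialogue generation primarily helps the agent present a consistent personality and produce more informative responses. However, we find that the responses generated by most previous models tend to be self-centered, showing little concern for the user in the dialogue. Moreover, we consider that human-like conversation is essentially built on inferring information about the other party's persona. Motivated by this, we propose a novel personalized dialogue generator that detects an implicit user persona. Because it is hard to collect a large number of detailed personas for each user, we attempt to model the user's potential persona and its representation from the dialogue history, without external knowledge. The perception and fader variables are formulated using conditional variational inference. These two latent variables simulate the process by which people become aware of each other's persona and produce a corresponding expression in conversation. Finally, posterior-discriminated regularization is proposed to enhance the training procedure. Empirical studies demonstrate that, compared with state-of-the-art methods, our approach pays more attention to the user's persona and achieves a considerable boost across the evaluations.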
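In this work, a kernel attention module is presented for the task of EEG-based emotion classification with neural networks. The proposed module realizes a self-attention mechanism by performing a kernel trick, and it requires fewer trainable parameters and less computation than standard attention modules. The design also provides a scalar for quantitatively examining the amount of attention assigned during deep feature refinement, which helps in interpreting a trained model. Using EEGNet as the backbone model, extensive experiments are conducted on the SEED dataset to assess the module's performance on within-subject classification tasks in comparison with other SOTA attention modules. Requiring only one additional parameter, the inserted module is shown to raise the base model's mean prediction accuracy by more than 1% across 15 subjects. A key component of the method is the interpretability of its solutions, which is addressed using several different techniques and is included throughout as part of the dependency analysis.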
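In this work, a parameter-efficient attention module is presented for emotion classification using a limited, or relatively small, number of electroencephalogram (EEG) signals. The module is called the Monotonicity Constrained Attention Module (MCAM) because it can incorporate priors on monotonicity when converting the Gram matrix of the features into an attention matrix for better feature refinement. Our experiments show that MCAM is comparable in effectiveness to state-of-the-art attention modules at boosting the backbone network's prediction performance, while requiring fewer parameters. Several accompanying sensitivity analyses of the trained models' predictions under different attacks are also performed. These attacks include various levels of frequency-domain filtering and gradual morphing between samples associated with multiple labels. Our results can help in better understanding the behavior of different modules in prediction and can provide guidance in applications where data is limited and noisy.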
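To make the Gram-matrix-to-attention idea above concrete, the following is a minimal sketch and not the MCAM implementation. It assumes, purely for illustration, that the monotonicity prior is imposed by passing each Gram entry through a non-decreasing function (a sigmoid with a positive slope) and then normalizing each row. The names `gram_matrix`, `monotone_attention`, and `slope` are hypothetical and do not come from the paper.

```python
import math


def gram_matrix(features):
    """Pairwise inner products between feature vectors (rows of `features`)."""
    return [[sum(a * b for a, b in zip(u, v)) for v in features] for u in features]


def monotone_attention(features, slope=1.0):
    """Map a Gram matrix to a row-normalized attention matrix.

    Assumption for illustration: each Gram entry passes through a sigmoid
    with positive slope, so within a row a larger inner product never
    receives a smaller attention weight.
    """
    if slope <= 0:
        raise ValueError("slope must be positive to keep the mapping monotone")
    attention = []
    for row in gram_matrix(features):
        # Numerically stable sigmoid: sigmoid(x) = 0.5 * (1 + tanh(x / 2)).
        scores = [0.5 * (1.0 + math.tanh(0.5 * slope * g)) for g in row]
        total = sum(scores)  # at least 0.5, since the diagonal entry is >= 0
        attention.append([s / total for s in scores])
    return attention


if __name__ == "__main__":
    toy_features = [[1.0, 0.0], [0.5, 0.5], [-1.0, 0.0]]
    for weights in monotone_attention(toy_features):
        print([round(w, 3) for w in weights])
```

Each output row sums to one, and the ordering of the weights within a row follows the ordering of the corresponding Gram entries. How the actual module parameterizes and enforces monotonicity is not specified in the abstract.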
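In recent deep image compression neural networks, the entropy model plays a critical role in estimating the prior distribution of deep image encodings. Existing methods combine a hyperprior with local context in the entropy estimation function, which greatly limits their performance because no global view is available. In this work, we propose a novel global reference model for image compression that effectively leverages both local and global context information, leading to an improved compression rate. The proposed method scans the decoded latents and then finds the most relevant latent to assist in estimating the distribution of the current latent. A by-product of this work is a novel mean-shifting GDN module that further improves performance. Experimental results demonstrate that the proposed model outperforms most state-of-the-art methods in the industry in terms of rate-distortion performance.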
In this paper, we propose a robust 3D detector, named Cross Modal Transformer (CMT), for end-to-end 3D multi-modal detection. Without explicit view transformation, CMT takes image and point cloud tokens as inputs and directly outputs accurate 3D bounding boxes. The spatial alignment of multi-modal tokens is performed implicitly, by encoding the 3D points into multi-modal features. The core design of CMT is quite simple, yet its performance is impressive: CMT obtains 73.0% NDS on the nuScenes benchmark. Moreover, CMT remains strongly robust even when the LiDAR input is missing. Code will be released at https://github.com/junjie18/CMT.
Knowledge graphs (KGs) have served as a key component of various natural language processing applications. Commonsense knowledge graphs (CKGs) are a special type of KG in which entities and relations are composed of free-form text. However, previous work on KG completion and CKG completion suffers from long-tail relations and newly added relations that do not have many known triples for training. In light of this, few-shot KG completion (FKGC), which requires the strengths of both graph representation learning and few-shot learning, has been proposed to address the problem of limited annotated data. In this paper, we comprehensively survey previous attempts at such tasks in the form of a series of methods and applications. Specifically, we first introduce FKGC challenges, commonly used KGs, and CKGs. Then we systematically categorize and summarize existing work in terms of the type of KG and the methods used. Finally, we present applications of FKGC models to prediction tasks in different areas and share our thoughts on future research directions for FKGC.
Few Shot Instance Segmentation (FSIS) requires models to detect and segment novel classes given only a few support examples. In this work, we explore a simple yet unified solution for FSIS as well as its incremental variants, and introduce a new framework named Reference Twice (RefT) to fully explore the relationship between support and query features based on a Transformer-like framework. Our key insights are twofold. First, with the aid of support masks, we can generate dynamic class centers more appropriately to re-weight query features. Second, we find that support object queries have already encoded key factors after base training. In this way, the query features can be enhanced twice, at the feature level and at the instance level. In particular, we first design a mask-based dynamic weighting module to enhance support features and then propose to link object queries for better calibration via cross-attention. After these steps, performance on the novel classes improves significantly over our strong baseline. Additionally, our new framework can be easily extended to incremental FSIS with minor modification. When benchmarked on the COCO dataset under the FSIS, gFSIS, and iFSIS settings, our method achieves competitive performance compared to existing approaches across different shots; for example, we boost nAP by a noticeable +8.2/+9.4 over the current state-of-the-art FSIS method in the 10/30-shot settings. We further demonstrate the superiority of our approach on Few Shot Object Detection. Code and model will be available.
Graph Neural Networks (GNNs) have shown satisfying performance on various graph learning tasks. To achieve better fitting capability, most GNNs have a large number of parameters, which makes them computationally expensive. It is therefore difficult to deploy them onto edge devices with scarce computational resources, e.g., mobile phones and wearable smart devices. Knowledge Distillation (KD) is a common solution for compressing GNNs, in which a lightweight model (i.e., the student model) is encouraged to mimic the behavior of a computationally expensive GNN (i.e., the teacher GNN model). Nevertheless, most existing GNN-based KD methods lack fairness consideration. As a consequence, the student model usually inherits and even exaggerates the bias of the teacher GNN. To handle this problem, we take initial steps towards fair knowledge distillation for GNNs. Specifically, we first formulate a novel problem of fair knowledge distillation for GNN-based teacher-student frameworks. Then we propose a principled framework named RELIANT to mitigate the bias exhibited by the student model. Notably, the design of RELIANT is decoupled from any specific teacher and student model structures, and thus can be easily adapted to various GNN-based KD frameworks. We perform extensive experiments on multiple real-world datasets, which corroborate that RELIANT achieves less biased GNN knowledge distillation while maintaining high prediction utility.
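As a point of reference for the teacher-student setup described above, the following is a minimal sketch of a generic temperature-scaled distillation loss in which the student is pushed to match the teacher's softened output distribution. It is the widely used soft-label formulation, not RELIANT and not a fairness-aware objective. The function names and the default `temperature` are assumptions chosen for illustration.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    log_total = peak + math.log(sum(math.exp(s - peak) for s in scaled))
    return [s - log_total for s in scaled]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    The T^2 factor is the usual convention that keeps gradient magnitudes
    comparable across temperatures; it is not specific to any GNN method.
    """
    log_p_teacher = log_softmax(teacher_logits, temperature)
    log_p_student = log_softmax(student_logits, temperature)
    kl = sum(
        math.exp(lt) * (lt - ls)
        for lt, ls in zip(log_p_teacher, log_p_student)
    )
    return temperature ** 2 * kl


if __name__ == "__main__":
    teacher = [1.0, 2.0, 3.0]  # hypothetical per-node class logits
    print(distillation_loss([1.0, 2.0, 3.0], teacher))  # identical outputs
    print(distillation_loss([3.0, 2.0, 1.0], teacher))  # mismatched outputs
```

The loss is zero when the student reproduces the teacher's logits and grows as the two distributions diverge. In practice this term is usually combined with a supervised loss on the ground-truth labels.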
This paper focuses on designing efficient models with few parameters and low FLOPs for dense predictions. Even though CNN-based lightweight methods have achieved stunning results after years of research, the trade-off between model accuracy and constrained resources still needs further improvement. This work rethinks the essential unity of the efficient Inverted Residual Block in MobileNetv2 and the effective Transformer in ViT, inductively abstracting a general concept of a Meta-Mobile Block, and we argue that the specific instantiation is very important to model performance even when the same framework is shared. Motivated by this phenomenon, we deduce a simple yet efficient modern Inverted Residual Mobile Block (iRMB) for mobile applications, which absorbs CNN-like efficiency to model short-distance dependencies and Transformer-like dynamic modeling capability to learn long-distance interactions. Furthermore, we design a ResNet-like 4-phase Efficient MOdel (EMO) based only on a series of iRMBs for dense applications. Extensive experiments on the ImageNet-1K, COCO2017, and ADE20K benchmarks demonstrate the superiority of our EMO over state-of-the-art methods; e.g., our EMO-1M/2M/5M achieve 71.5, 75.1, and 78.4 Top-1 accuracy, surpassing SoTA CNN- and Transformer-based models, while trading off model accuracy and efficiency well.